Overview
Dataset statistics
| Number of variables | 9 |
|---|---|
| Number of observations | 3647 |
| Missing cells | 3692 |
| Missing cells (%) | 11.2% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 1.9 MiB |
| Average record size in memory | 549.4 B |
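The missing-cell totals above can be cross-checked against the per-variable missing counts given in the Variables section (3647 for emailAlternativo, 44 for nacionalidad, 1 for Apellido). This is a minimal sketch, assuming the percentage is taken over all cells (observations × variables); the variable names in the code are illustrative only.

```python
# Per-variable missing counts as listed in the Variables section of this report.
missing_by_variable = {"emailAlternativo": 3647, "nacionalidad": 44, "Apellido": 1}

n_observations = 3647
n_variables = 9

missing_cells = sum(missing_by_variable.values())

# Assumption: the percentage is computed over every cell in the table.
missing_pct = 100 * missing_cells / (n_observations * n_variables)

print(missing_cells)          # 3692
print(f"{missing_pct:.1f}%")  # 11.2%
```

Almost all missing cells come from emailAlternativo, which is empty in every row.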
Variable types
| Numeric | 2 |
|---|---|
| Text | 3 |
| Categorical | 3 |
| Unsupported | 1 |
Alerts
| Alert | Type |
|---|---|
| CUIT is highly overall correlated with estadoCivil and 3 other fields | High correlation |
| estadoCivil is highly overall correlated with CUIT | High correlation |
| nacionalidad is highly overall correlated with CUIT | High correlation |
| nroDoc is highly overall correlated with CUIT | High correlation |
| tipoDoc is highly overall correlated with CUIT | High correlation |
| tipoDoc is highly imbalanced (97.1%) | Imbalance |
| nacionalidad is highly imbalanced (94.9%) | Imbalance |
| nacionalidad has 44 (1.2%) missing values | Missing |
| emailAlternativo has 3647 (100.0%) missing values | Missing |
| nroDoc is highly skewed (γ1 = 23.31455586) | Skewed |
| CUIT has unique values | Unique |
| emailAlternativo is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
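The 97.1% imbalance reported for tipoDoc is consistent with a normalized-entropy score, 1 − H / log(k), computed over the category counts listed under tipoDoc (3624, 9, 7, 5, 2). The sketch below is an assumption about how such a figure can be derived, not a copy of the profiler's code; `imbalance_score` is a hypothetical helper name.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log(k), where H is the Shannon entropy of the counts.

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single class is treated as "no imbalance"
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log(k)


# Category counts for tipoDoc as reported: DNI, LC, LE, CI, Pasaporte.
tipo_doc_counts = [3624, 9, 7, 5, 2]
score = imbalance_score(tipo_doc_counts)
print(f"{100 * score:.1f}%")  # 97.1%
```

The same check cannot be repeated for nacionalidad from this report alone, because the counts of its 14 least frequent categories are only given in aggregate.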
Reproduction
| Analysis started | 2025-03-24 22:35:27.785601 |
|---|---|
| Analysis finished | 2025-03-24 22:37:14.159568 |
| Duration | 1 minute and 46.37 seconds |
| Software version | ydata-profiling v4.15.1 |
| Download configuration | config.json |
Variables
CUIT
Real number (ℝ)
High correlation  Unique 
| Distinct | 3647 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.2205399 × 10^10 |
| Minimum | 2.0017346 × 10^10 |
| Maximum | 2.795928 × 10^10 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 32.2 KiB |
Quantile statistics
| Minimum | 2.0017346 × 10^10 |
|---|---|
| 5-th percentile | 2.0083137 × 10^10 |
| Q1 | 2.0182 × 10^10 |
| median | 2.0287289 × 10^10 |
| Q3 | 2.3379382 × 10^10 |
| 95-th percentile | 2.7309133 × 10^10 |
| Maximum | 2.795928 × 10^10 |
| Range | 7.941934 × 10^9 |
| Interquartile range (IQR) | 3.1973826 × 10^9 |
Descriptive statistics
| Standard deviation | 2.9741399 × 10^9 |
|---|---|
| Coefficient of variation (CV) | 0.13393769 |
| Kurtosis | -0.86613529 |
| Mean | 2.2205399 × 10^10 |
| Median Absolute Deviation (MAD) | 1.5188549 × 10^8 |
| Skewness | 0.98362253 |
| Sum | 8.0983091 × 10^13 |
| Variance | 8.8455083 × 10^18 |
| Monotonicity | Not monotonic |
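Several of the CUIT statistics above are simple functions of the others: range = maximum − minimum, IQR = Q3 − Q1, CV = standard deviation / mean, and variance = standard deviation squared. The sketch below recomputes them from the rounded values printed in this report, so agreement is only expected up to rounding; the tolerance is an assumption chosen for that reason.

```python
import math

# Rounded CUIT statistics as printed in this report.
minimum, maximum = 2.0017346e10, 2.795928e10
q1, q3 = 2.0182e10, 2.3379382e10
mean, std = 2.2205399e10, 2.9741399e9

# name -> (recomputed from the values above, value printed in the report)
checks = {
    "range": (maximum - minimum, 7.941934e9),
    "iqr": (q3 - q1, 3.1973826e9),
    "cv": (std / mean, 0.13393769),
    "variance": (std ** 2, 8.8455083e18),
}

# Inputs are rounded to about 8 significant digits, so use a loose tolerance.
results = {
    name: math.isclose(recomputed, reported, rel_tol=1e-5)
    for name, (recomputed, reported) in checks.items()
}
print(results)
```

All four entries come out consistent, which suggests the table was extracted without digit-level damage apart from the exponent notation.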
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 27331126530 | 1 | < 0.1% |
| 27236909900 | 1 | < 0.1% |
| 20305924076 | 1 | < 0.1% |
| 20082883240 | 1 | < 0.1% |
| 20230506060 | 1 | < 0.1% |
| 20141208269 | 1 | < 0.1% |
| 20271472901 | 1 | < 0.1% |
| 20302787221 | 1 | < 0.1% |
| 20103522324 | 1 | < 0.1% |
| 20264053367 | 1 | < 0.1% |
| Other values (3637) | 3637 | 99.7% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 20017345738 | 1 | < 0.1% |
| 20041217643 | 1 | < 0.1% |
| 20041398125 | 1 | < 0.1% |
| 20041722720 | 1 | < 0.1% |
| 20041901412 | 1 | < 0.1% |
| 20041956799 | 1 | < 0.1% |
| 20042053229 | 1 | < 0.1% |
| 20042977226 | 1 | < 0.1% |
| 20042986624 | 1 | < 0.1% |
| 20043172787 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 27959279778 | 1 | < 0.1% |
| 27954188715 | 1 | < 0.1% |
| 27953768521 | 1 | < 0.1% |
| 27949584165 | 1 | < 0.1% |
| 27949461659 | 1 | < 0.1% |
| 27947115893 | 1 | < 0.1% |
| 27946720556 | 1 | < 0.1% |
| 27946027389 | 1 | < 0.1% |
| 27941038951 | 1 | < 0.1% |
| 27940657089 | 1 | < 0.1% |
Nombre
Text
| Distinct | 3015 |
|---|---|
| Distinct (%) | 82.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 255.0 KiB |
Length
| Max length | 51 |
|---|---|
| Median length | 30 |
| Mean length | 12.347409 |
| Min length | 3 |
Unique
| Unique | 2684 |
|---|---|
| Unique (%) | 73.6% |
Sample
| 1st row | edit mabel |
|---|---|
| 2nd row | EZEQUIEL MARTÍN |
| 3rd row | HORACIO MIGUEL |
| 4th row | Alfredo |
| 5th row | GABRIEL |
| Value | Count | Frequency (%) |
| carlos | 156 | 2.3% |
| daniel | 148 | 2.2% |
| maria | 146 | 2.1% |
| juan | 140 | 2.1% |
| luis | 139 | 2.0% |
| alberto | 134 | 2.0% |
| jose | 128 | 1.9% |
| alejandro | 120 | 1.8% |
| jorge | 105 | 1.5% |
| eduardo | 103 | 1.5% |
| Other values (853) | 5483 |
Most occurring characters
| Value | Count | Frequency (%) |
| A | 3664 | 8.1% |
| (space) | 3155 | 7.0% |
| a | 3086 | 6.9% |
| R | 2131 | 4.7% |
| E | 2121 | 4.7% |
| O | 2010 | 4.5% |
| I | 1867 | 4.1% |
| i | 1821 | 4.0% |
| r | 1783 | 4.0% |
| e | 1760 | 3.9% |
| Other values (55) | 21633 |
Apellido
Text
| Distinct | 3049 |
|---|---|
| Distinct (%) | 83.6% |
| Missing | 1 |
| Missing (%) | < 0.1% |
| Memory size | 235.1 KiB |
Length
| Max length | 51 |
|---|---|
| Median length | 26 |
| Mean length | 7.4561163 |
| Min length | 2 |
Unique
| Unique | 2760 |
|---|---|
| Unique (%) | 75.7% |
Sample
| 1st row | RETAMAR |
|---|---|
| 2nd row | DAWIDOWSKI |
| 3rd row | ESPOSITO |
| 4th row | Sampedro |
| 5th row | ALSO |
| Value | Count | Frequency (%) |
| de | 55 | 1.3% |
| gonzalez | 45 | 1.1% |
| fernandez | 36 | 0.9% |
| garcia | 33 | 0.8% |
| rodriguez | 31 | 0.8% |
| gomez | 29 | 0.7% |
| lopez | 23 | 0.6% |
| diaz | 21 | 0.5% |
| martinez | 21 | 0.5% |
| perez | 19 | 0.5% |
| Other values (2805) | 3765 |
Most occurring characters
| Value | Count | Frequency (%) |
| A | 1976 | 7.3% |
| a | 1690 | 6.2% |
| E | 1320 | 4.9% |
| R | 1318 | 4.8% |
| e | 1235 | 4.5% |
| O | 1164 | 4.3% |
| I | 1123 | 4.1% |
| o | 1120 | 4.1% |
| r | 1103 | 4.1% |
| i | 1102 | 4.1% |
| Other values (57) | 14034 |
tipoDoc
Categorical
High correlation  Imbalance 
| Distinct | 5 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 213.8 KiB |
Length
| Max length | 9 |
|---|---|
| Median length | 3 |
| Mean length | 2.9975322 |
| Min length | 2 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | DNI |
|---|---|
| 2nd row | DNI |
| 3rd row | DNI |
| 4th row | DNI |
| 5th row | DNI |
Common Values
| Value | Count | Frequency (%) |
| DNI | 3624 | 99.4% |
| LC | 9 | 0.2% |
| LE | 7 | 0.2% |
| CI | 5 | 0.1% |
| Pasaporte | 2 | 0.1% |
Histogram of lengths of the category
Most occurring characters
| Value | Count | Frequency (%) |
| I | 3629 | |
| D | 3624 | |
| N | 3624 | |
| L | 16 | 0.1% |
| C | 14 | 0.1% |
| E | 7 | 0.1% |
| a | 4 | < 0.1% |
| P | 2 | < 0.1% |
| s | 2 | < 0.1% |
| p | 2 | < 0.1% |
| Other values (4) | 8 | 0.1% |
nroDoc
Real number (ℝ)
High correlation  Skewed 
| Distinct | 3646 |
|---|---|
| Distinct (%) | > 99.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 72270510 |
| Minimum | 1080 |
| Maximum | 2.7313075 × 10^10 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 32.2 KiB |
Quantile statistics
| Minimum | 1080 |
|---|---|
| 5-th percentile | 6745894.6 |
| Q1 | 14567634 |
| median | 23335997 |
| Q3 | 29322806 |
| 95-th percentile | 39720863 |
| Maximum | 2.7313075 × 10^10 |
| Range | 2.7313074 × 10^10 |
| Interquartile range (IQR) | 14755173 |
Descriptive statistics
| Standard deviation | 1.0493586 × 10^9 |
|---|---|
| Coefficient of variation (CV) | 14.519873 |
| Kurtosis | 551.53392 |
| Mean | 72270510 |
| Median Absolute Deviation (MAD) | 6881368 |
| Skewness | 23.314556 |
| Sum | 2.6357055 × 10^11 |
| Variance | 1.1011535 × 10^18 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 31248337 | 2 | 0.1% |
| 18507057 | 1 | < 0.1% |
| 33112653 | 1 | < 0.1% |
| 23690990 | 1 | < 0.1% |
| 30592407 | 1 | < 0.1% |
| 8288324 | 1 | < 0.1% |
| 23050606 | 1 | < 0.1% |
| 14120826 | 1 | < 0.1% |
| 27147290 | 1 | < 0.1% |
| 30278722 | 1 | < 0.1% |
| Other values (3636) | 3636 | 99.7% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1080 | 1 | < 0.1% |
| 774444 | 1 | < 0.1% |
| 1734573 | 1 | < 0.1% |
| 1791154 | 1 | < 0.1% |
| 2196200 | 1 | < 0.1% |
| 2654011 | 1 | < 0.1% |
| 2935023 | 1 | < 0.1% |
| 3142258 | 1 | < 0.1% |
| 3427647 | 1 | < 0.1% |
| 3490344 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 27313074795 | 1 | < 0.1% |
| 27277864458 | 1 | < 0.1% |
| 27269490603 | 1 | < 0.1% |
| 23173177844 | 1 | < 0.1% |
| 20936640223 | 1 | < 0.1% |
| 20309591039 | 1 | < 0.1% |
| 20068691711 | 1 | < 0.1% |
| 2024908763 | 1 | < 0.1% |
| 526312777 | 1 | < 0.1% |
| 394282931 | 1 | < 0.1% |
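The ten largest nroDoc values listed above are far outside the bulk of the distribution (the 95-th percentile is 39720863), and they plausibly drive the skewness of 23.3 reported for this variable. Several of them have 11 digits, like the values in the CUIT column, which may indicate a CUIT entered in the document-number field. The sketch below flags such values; the 8-digit limit for a document number and the helper name `looks_like_cuit` are assumptions made for illustration, not rules taken from this report.

```python
# Ten largest nroDoc values, copied from this report.
largest_nro_doc = [
    27313074795, 27277864458, 27269490603, 23173177844, 20936640223,
    20309591039, 20068691711, 2024908763, 526312777, 394282931,
]

MAX_DNI_DIGITS = 8  # assumption: a document number has at most 8 digits


def looks_like_cuit(value):
    """Heuristic (assumption): an 11-digit value may be a CUIT, not a document number."""
    return len(str(value)) == 11


# Values too long to be a plausible document number under the assumption above.
too_long = [v for v in largest_nro_doc if len(str(v)) > MAX_DNI_DIGITS]
# Subset with the same digit count as the CUIT column.
cuit_like = [v for v in too_long if looks_like_cuit(v)]

print(len(too_long), len(cuit_like))  # 10 7
```

Flagged rows are candidates for manual review rather than automatic correction, since the report alone does not show how the values were entered.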
nacionalidad
Categorical
High correlation  Imbalance  Missing 
| Distinct | 24 |
|---|---|
| Distinct (%) | 0.7% |
| Missing | 44 |
| Missing (%) | 1.2% |
| Memory size | 235.0 KiB |
Length
| Max length | 9 |
|---|---|
| Median length | 9 |
| Mean length | 8.9483764 |
| Min length | 4 |
Unique
| Unique | 9 |
|---|---|
| Unique (%) | 0.2% |
Sample
| 1st row | Argentina |
|---|---|
| 2nd row | Argentina |
| 3rd row | Argentina |
| 4th row | Argentina |
| 5th row | Argentina |
Common Values
| Value | Count | Frequency (%) |
| Argentina | 3527 | 96.7% |
| Italia | 9 | 0.2% |
| Paraguay | 8 | 0.2% |
| España | 7 | 0.2% |
| Bolivia | 7 | 0.2% |
| Perú | 6 | 0.2% |
| Brasil | 6 | 0.2% |
| Uruguay | 5 | 0.1% |
| Venezuela | 4 | 0.1% |
| Chile | 4 | 0.1% |
| Other values (14) | 20 | 0.5% |
| (Missing) | 44 | 1.2% |
Histogram of lengths of the category
Most occurring characters
| Value | Count | Frequency (%) |
| n | 7068 | |
| a | 3628 | |
| i | 3576 | |
| r | 3564 | |
| e | 3555 | |
| g | 3543 | |
| t | 3539 | |
| A | 3533 | |
| l | 37 | 0.1% |
| u | 25 | 0.1% |
| Other values (26) | 173 | 0.5% |
email
Text
| Distinct | 3522 |
|---|---|
| Distinct (%) | 96.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 291.1 KiB |
Length
| Max length | 44 |
|---|---|
| Median length | 39 |
| Mean length | 24.691801 |
| Min length | 10 |
Unique
| Unique | 3487 |
|---|---|
| Unique (%) | 95.6% |
Sample
| 1st row | edit-retamar@hotmail.com |
|---|---|
| 2nd row | lazarodawi@hotmail.com |
| 3rd row | esposito.h@gmail.com |
| 4th row | servicentrorefrigeracion@yahoo.com.ar |
| 5th row | gabrielalso@outlook.com |
| Value | Count | Frequency (%) |
| artisticacck@gmail.com | 78 | 2.1% |
| artisticagrl@gmail.com | 7 | 0.2% |
| leonardorilo@starnovagroup.com | 7 | 0.2% |
| libreria_centro@hotmail.com | 3 | 0.1% |
| miguel_lamera@hotmail.com | 3 | 0.1% |
| grupolmmendoza@gmail.com | 3 | 0.1% |
| rodriguezs@creditoautomatico.com.ar | 3 | 0.1% |
| brocazfernando@gmail.com | 2 | 0.1% |
| marceseig@hotmail.com | 2 | 0.1% |
| jcasociados@hotmail.com | 2 | 0.1% |
| Other values (3510) | 3537 |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 9964 | 11.1% |
| o | 9117 | 10.1% |
| m | 7530 | 8.4% |
| i | 6976 | 7.7% |
| c | 6041 | 6.7% |
| l | 5126 | 5.7% |
| . | 4930 | 5.5% |
| r | 4863 | 5.4% |
| e | 4486 | 5.0% |
| @ | 3647 | 4.0% |
| Other values (56) | 27371 |
emailAlternativo
Unsupported
Missing  Rejected  Unsupported 
| Missing | 3647 |
|---|---|
| Missing (%) | 100.0% |
| Memory size | 28.6 KiB |
estadoCivil
Categorical
High correlation 
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 226.6 KiB |
Length
| Max length | 7 |
|---|---|
| Median length | 7 |
| Mean length | 6.5870579 |
| Min length | 6 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Casado |
|---|---|
| 2nd row | Soltero |
| 3rd row | Casado |
| 4th row | Soltero |
| 5th row | Soltero |
Common Values
| Value | Count | Frequency (%) |
| Soltero | 2141 | 58.7% |
| Casado | 1506 | 41.3% |
Histogram of lengths of the category
Most occurring characters
| Value | Count | Frequency (%) |
| o | 5788 | |
| a | 3012 | |
| S | 2141 | 8.9% |
| l | 2141 | 8.9% |
| e | 2141 | 8.9% |
| t | 2141 | 8.9% |
| r | 2141 | 8.9% |
| C | 1506 | 6.3% |
| s | 1506 | 6.3% |
| d | 1506 | 6.3% |
Interactions
Correlations
| | CUIT | estadoCivil | nacionalidad | nroDoc | tipoDoc |
|---|---|---|---|---|---|
| CUIT | 1.000 | 1.000 | 1.000 | 0.517 | 1.000 |
| estadoCivil | 1.000 | 1.000 | 0.000 | 0.017 | 0.000 |
| nacionalidad | 1.000 | 0.000 | 1.000 | 0.080 | 0.000 |
| nroDoc | 0.517 | 0.017 | 0.080 | 1.000 | 0.033 |
| tipoDoc | 1.000 | 0.000 | 0.000 | 0.033 | 1.000 |
Missing values
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
Sample
| | CUIT | Nombre | Apellido | tipoDoc | nroDoc | nacionalidad | email | emailAlternativo | estadoCivil |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 27236909900 | edit mabel | RETAMAR | DNI | 23690990 | Argentina | edit-retamar@hotmail.com | NaN | Casado |
| 1 | 20305924076 | EZEQUIEL MARTÍN | DAWIDOWSKI | DNI | 30592407 | Argentina | lazarodawi@hotmail.com | NaN | Soltero |
| 2 | 20082883240 | HORACIO MIGUEL | ESPOSITO | DNI | 8288324 | Argentina | esposito.h@gmail.com | NaN | Casado |
| 3 | 20230506060 | Alfredo | Sampedro | DNI | 23050606 | Argentina | servicentrorefrigeracion@yahoo.com.ar | NaN | Soltero |
| 4 | 20141208269 | GABRIEL | ALSO | DNI | 14120826 | Argentina | gabrielalso@outlook.com | NaN | Soltero |
| 5 | 20271472901 | luciano hernan | diz | DNI | 27147290 | Argentina | lucianohdiz@gmail.com | NaN | Soltero |
| 6 | 20302787221 | Martin Hernan | Barbatelli | DNI | 30278722 | Argentina | martin.barbatelli@gmail.com | NaN | Soltero |
| 7 | 20103522324 | Dante Oscar | Riveros | DNI | 10352232 | Argentina | agrimdanteriveros@gmail.com | NaN | Casado |
| 8 | 20264053367 | LEANDRO | BOBADILLA | DNI | 26405336 | Argentina | servisurmdp@gmail.com | NaN | Soltero |
| 9 | 20310857751 | Pablo Martin | de la Cruz | DNI | 31085775 | Argentina | info@pablodelacruzeventos.com | NaN | Soltero |
| | CUIT | Nombre | Apellido | tipoDoc | nroDoc | nacionalidad | email | emailAlternativo | estadoCivil |
|---|---|---|---|---|---|---|---|---|---|
| 3637 | 20385356243 | JONATAN MACIEL | DIAZ | DNI | 38535624 | Argentina | macijon10@gmail.com | NaN | Soltero |
| 3638 | 20322310944 | Ezequiel Fernando | Rodríguez | DNI | 32231094 | Argentina | efrrodriguez1724@gmail.com | NaN | Soltero |
| 3639 | 20149549987 | LUIS MARIA | ROBOL | DNI | 14954998 | Argentina | brunorobol@transporterobol.com | NaN | Casado |
| 3640 | 24925253304 | FREDY MARTIN | ALBERTI | DNI | 92525330 | Uruguay | tincho75@yahoo.com | NaN | Soltero |
| 3641 | 20328156742 | José María | Alegre | DNI | 32815674 | Argentina | monachitapapandrew@gmail.com | NaN | Soltero |
| 3642 | 20171591563 | Julio Gustavo | Lazzos | DNI | 17159156 | Argentina | biotecnika@hotmail.com | NaN | Soltero |
| 3643 | 20293290416 | ELIAS MARTIN | SEGURA | DNI | 29329041 | Argentina | e.segura1982@gmail.com | NaN | Soltero |
| 3644 | 20240423759 | FEDERICO MARTIN | NUÑEZ | DNI | 24042375 | Argentina | federico@inspira.ar | NaN | Casado |
| 3645 | 20287286687 | MARIO CEFERINO | LAZARTE | DNI | 28728668 | Argentina | lazarte.events@gmail.com | NaN | Soltero |
| 3646 | 27331126530 | Soledad Anahi | Cerillano | DNI | 33112653 | Argentina | solecerillano@gmail.com | NaN | Soltero |
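In the sample rows above, the nroDoc value appears as digits 3 to 10 of the 11-digit CUIT (for example, CUIT 27236909900 and nroDoc 23690990). This pattern is an observation from the sample, not something the report states, but it would help explain why nroDoc is flagged as correlated with CUIT. The sketch below checks it on a few pairs copied from the sample; `embedded_doc_number` is a hypothetical helper name.

```python
def embedded_doc_number(cuit):
    """Return digits 3-10 of an 11-digit CUIT as an integer.

    Assumption: the CUIT is laid out as a 2-digit prefix, an 8-digit
    document number, and a final digit, as the sample rows suggest.
    """
    digits = str(cuit)
    if len(digits) != 11:
        raise ValueError(f"expected 11 digits, got {len(digits)}")
    return int(digits[2:10])


# (CUIT, nroDoc) pairs copied from the sample rows of this report.
sample_pairs = [
    (27236909900, 23690990),
    (20082883240, 8288324),   # leading zero in the embedded number
    (24925253304, 92525330),
    (27331126530, 33112653),
]

mismatches = [(c, d) for c, d in sample_pairs if embedded_doc_number(c) != d]
print(mismatches)  # []
```

Run over the full dataset, a check like this could surface rows where nroDoc and CUIT disagree, such as the 11-digit nroDoc values among the variable's largest entries.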